## CS3021/3421 Tutorial 4

You will need to use the DLX/MIPS animation to help you answer these questions (http://www.cs.tcd.ie/Jeremy.Jones/vivio/vivio.htm).

Q1. The figure below shows the internal data paths of the DLX/MIPS processor.



For each sub-question below, give a short code segment which shows the specified data path(s) being used.

- (i) O1 to MUX6
- (ii) O0 to MUX7 and O1 to MUX6 (simultaneously)
- (iii) O0 to MUX8
- (iv) EX to MUX7
- (v) Data cache to MUX9 (memory data-out)
- (vi) O0 to Zero detector
- (vii) Register File to MUX1
- (viii) Branch Target Buffer to MUX1

NB: It is possible to right click on a Vivio animation, select "copy as an enhanced metafile" and paste the image into a Word document.

Q2. Consider the execution of the following code segment (initially r1 = 1 and r2 = 2):

```
add   r1, r1, r2    ; r1 = r1 + r2
add   r2, r1, r2    ; r2 = r1 + r2
add   r1, r1, r2    ; r1 = r1 + r2
add   r2, r1, r2    ; r2 = r1 + r2
add   r1, r1, r2    ; r1 = r1 + r2
halt
```

Determine the resulting value of r1 and the number of clock cycles needed to execute the code segment when (i) ALU forwarding is enabled, (ii) ALU forwarding is disabled and CPU data dependency interlocks are enabled, and (iii) both ALU forwarding and CPU data dependency interlocks are disabled. Explain in detail why the resulting values and the cycle counts differ between the three cases.
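When checking values read off the animation, it can help to have a reference that ignores the pipeline. The Python sketch below is not part of the DLX/MIPS animation and does not count clock cycles. The names `PROGRAM`, `run` and `visible_after` are hypothetical and introduced only for illustration. With `visible_after=1` each `add` sees the result of the instruction before it, which is the sequential behaviour that forwarding or interlocks are meant to preserve. A larger `visible_after` is a crude stand-in for stale register reads when neither mechanism is active. The lag that matches the animation is an assumption to be verified by observation, not something this sketch establishes.

```python
# Register-level sketch of the Q2 code segment (no pipeline timing).
# Each entry is (destination, source1, source2) for one add instruction.
PROGRAM = [
    ("r1", "r1", "r2"),
    ("r2", "r1", "r2"),
    ("r1", "r1", "r2"),
    ("r2", "r1", "r2"),
    ("r1", "r1", "r2"),
]


def run(program=PROGRAM, visible_after=1):
    """Return the final registers.

    visible_after is an ASSUMED lag: the result of instruction i becomes
    readable by instruction i + visible_after. A value of 1 gives plain
    sequential semantics; larger values imitate stale register reads.
    """
    regs = {"r1": 1, "r2": 2}   # initial values given in Q2
    pending = []                # (index when readable, dst, value)
    for i, (dst, a, b) in enumerate(program):
        # Commit every earlier result that is old enough to be visible.
        while pending and pending[0][0] <= i:
            _, d, v = pending.pop(0)
            regs[d] = v
        # Compute with whatever register values are visible right now.
        pending.append((i + visible_after, dst, regs[a] + regs[b]))
    # Drain the remaining writes in program order.
    for _, d, v in pending:
        regs[d] = v
    return regs


if __name__ == "__main__":
    print("sequential reference:", run())
    print("assumed lag of 3:    ", run(visible_after=3))
```

Comparing the sequential reference with the value observed in each configuration shows which settings preserve program semantics. The cycle counts still have to be read from the animation itself.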

Note that you can store this program in the DLX/MIPS program database by clicking on "save configuration". Save it under "CS3021/your username/q2", e.g. "CS3021/jones/q2".

Q3. Click "Instruction Cache" until the program shown in Q1 is displayed.

- (i) What does the program do?
- (ii) How many instructions are executed and how many clock cycles are needed to execute this program until it halts? Explain <u>in detail</u> why these two numbers are not equal and account for each stall cycle.
- (iii) Click "Branch Prediction" until "Branch Interlock" is displayed. How many cycles are now needed to execute this program until it halts? Explain in detail why this number differs from your answer to part (ii).
- (iv) What is the effect on execution time if the two shift instructions are swapped and why?